Import the data sets extracted from the NaPPI_Data Preparation R Markdown.
## [1] "endpoint.txt" "NaPPI_DataAnalysis.html"
## [3] "NaPPI_DataAnalysis.Rmd" "plant_info.txt"
## [5] "S_timeseries.txt" "T_timeseries.txt"
## [7] "testtemplate" "testtemplatemod.gif"
## [9] "timeseries.txt"
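The import step can be sketched as follows. This is a minimal illustration, assuming the exported files are tab-separated text files with a header row; the separator and the helper name `read_nappi_tables` are assumptions, not taken from the original chunk. A toy file is written to a temporary folder so that the sketch runs on its own.

```r
# Hypothetical helper: read every .txt file of a folder into a named list.
read_nappi_tables <- function(dir) {
  files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
  tables <- lapply(files, read.table, header = TRUE, sep = "\t",
                   stringsAsFactors = FALSE)
  names(tables) <- tools::file_path_sans_ext(basename(files))
  tables
}

# Toy demonstration in a temporary folder (values are made up).
demo_dir <- file.path(tempdir(), "nappi_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(Genotype = c("EPPN01_H", "EPPN07_L"),
                       DW_shoot_g = c(33.8, 28.7)),
            file.path(demo_dir, "endpoint.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)

datasets <- read_nappi_tables(demo_dir)
endpoint_demo <- datasets[["endpoint"]]
```

Each data set is then available by file name, for example `datasets[["endpoint"]]`.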
We must convert the columns to factor and date formats.
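A minimal sketch of this conversion on a toy data frame. The column names and the ISO date format are assumptions made for the illustration; the real data sets may use other names.

```r
# Toy data frame; the column names are illustrative assumptions.
plants <- data.frame(
  Genotype = c("EPPN01_H", "EPPN07_L", "EPPN20_T"),
  Row      = c(1, 2, 3),
  Column   = c(1, 1, 2),
  Date     = c("2020-06-17", "2020-06-22", "2020-07-05"),
  stringsAsFactors = FALSE
)

# Design variables become factors so that models treat them as categorical.
factor_cols <- c("Genotype", "Row", "Column")
plants[factor_cols] <- lapply(plants[factor_cols], factor)

# Dates stored as text become Date objects (ISO format assumed).
plants$Date <- as.Date(plants$Date, format = "%Y-%m-%d")

str(plants)
```

Converting Row and Column to factors matters later: left as numbers, they would enter a linear model as slopes rather than as one level per row or column.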
This part extracts the variables in the endpoint dataframe.
## [1] "DW_shoot_g" "FW_shoot_g"
The variables for NaPPI are “DW_shoot_g” and “FW_shoot_g”.
## # A tibble: 2 × 10
## variable n min max median iqr mean sd se ci
## <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 DW_shoot_g 125 17.7 117 33.9 9.51 35.9 12.3 1.10 2.19
## 2 FW_shoot_g 125 13.2 347. 177. 89.0 178. 67.1 6.00 11.9
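The `se` and `ci` columns of this table are consistent with the standard error sd/sqrt(n) and the half-width of a t-based 95% confidence interval: for DW_shoot_g, 12.3/sqrt(125) is about 1.10, and multiplying by the t quantile (about 1.98 for 124 degrees of freedom) gives about 2.19. A base R sketch of these columns follows; the helper name `summarise_trait` is hypothetical.

```r
# Hypothetical helper reproducing the columns of the summary table.
summarise_trait <- function(x, conf_level = 0.95) {
  x  <- x[!is.na(x)]            # missing values are dropped first
  n  <- length(x)
  se <- sd(x) / sqrt(n)         # standard error of the mean
  data.frame(
    n = n, min = min(x), max = max(x), median = median(x),
    iqr = IQR(x), mean = mean(x), sd = sd(x), se = se,
    # half-width of the t-based confidence interval
    ci = qt(1 - (1 - conf_level) / 2, df = n - 1) * se
  )
}

stats_demo <- summarise_trait(c(1, 2, 3, 4, 5, NA))
print(stats_demo)
```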
## Genotype n
## 1 EPPN01_H 9
## 2 EPPN02_H 9
## 3 EPPN03_H 10
## 4 EPPN04_H 7
## 5 EPPN05_H 13
## 6 EPPN06_H 18
## 7 EPPN07_L 2
## 8 EPPN08_H 7
## 9 EPPN09_H 11
## 10 EPPN10_H 7
## 11 EPPN10_L 2
## 12 EPPN11_H 10
## 13 EPPN11_L 3
## 14 EPPN12_H 10
## 15 EPPN13_H 5
## 16 EPPN20_T 3
| Name | endpoint[, unlist(variabl… |
| Number of rows | 126 |
| Number of columns | 2 |
| _______________________ | |
| Column type frequency: | |
| numeric | 2 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| DW_shoot_g | 1 | 0.99 | 35.93 | 12.35 | 17.73 | 28.79 | 33.94 | 38.30 | 117.0 | ▇▂▁▁▁ |
| FW_shoot_g | 1 | 0.99 | 177.54 | 67.14 | 13.23 | 130.80 | 177.25 | 219.75 | 346.9 | ▂▆▇▅▂ |
## Warning: Removed 1 rows containing non-finite values (`stat_boxplot()`).
## Removed 1 rows containing non-finite values (`stat_boxplot()`).
Remove the outliers by replacing them with missing values (NA), then check the normality of the residuals.
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(density)` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## 117 121
## 116 120
## [1] 47.30 49.56 40.96 27.85
## 31 105
## 30 104
## [1] 198.50 325.94 208.54 106.21
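The rule used to flag outliers is not shown in this output. As an illustration only, the sketch below assumes the usual boxplot rule (values beyond 1.5 times the interquartile range from the quartiles), replaces the flagged values with `NA`, and then runs a Shapiro-Wilk test on the residuals of a model fitted to simulated data. The helper name `mask_outliers` and all values are hypothetical.

```r
# Hypothetical helper: set values outside the boxplot fences to NA.
mask_outliers <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- k * (q[2] - q[1])
  x[!is.na(x) & (x < q[1] - fence | x > q[2] + fence)] <- NA
  x
}

weights_demo <- c(10, 11, 12, 13, 14, 100)
cleaned_demo <- mask_outliers(weights_demo)
print(cleaned_demo)

# Normality check of model residuals on simulated data.
set.seed(1)
sim <- data.frame(g = gl(3, 10), y = rnorm(30, mean = 35, sd = 5))
shapiro_demo <- shapiro.test(residuals(lm(y ~ g, data = sim)))
print(shapiro_demo$p.value)
```

Keeping the flagged plants as `NA` rather than dropping the rows preserves the layout of the experiment, which the row and column models need.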
NOTE: CHANGE THE VARIABLE NAMES HERE
## Warning: The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
## ℹ Please use the `linewidth` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning: Removed 1 rows containing non-finite values (`stat_ydensity()`).
## Warning: Removed 1 rows containing non-finite values (`stat_sina()`).
## Warning: Removed 1 rows containing non-finite values (`stat_ydensity()`).
## Warning: Removed 1 rows containing non-finite values (`stat_sina()`).
NOTE: CHANGE THE VARIABLE NAMES HERE
| Name | endpoint_clean[, unlist(v… |
| Number of rows | 126 |
| Number of columns | 2 |
| _______________________ | |
| Column type frequency: | |
| numeric | 2 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| DW_shoot_g | 1 | 0.99 | 35.93 | 12.35 | 17.73 | 28.79 | 33.94 | 38.30 | 117.0 | ▇▂▁▁▁ |
| FW_shoot_g | 1 | 0.99 | 177.54 | 67.14 | 13.23 | 130.80 | 177.25 | 219.75 | 346.9 | ▂▆▇▅▂ |
## # A tibble: 16 × 4
## Genotype mean std.dev n_missing
## <fct> <dbl> <dbl> <int>
## 1 EPPN13_H 46.0 17.6 0
## 2 EPPN12_H 44.6 28.7 0
## 3 EPPN09_H 41.9 10.5 0
## 4 EPPN10_H 39.7 17.4 0
## 5 EPPN10_L 36.4 5.29 0
## 6 EPPN08_H 35.2 5.98 0
## 7 EPPN05_H 34.8 9.79 0
## 8 EPPN04_H 34.7 3.91 0
## 9 EPPN06_H 34.1 8.64 0
## 10 EPPN03_H 33.9 10.3 0
## 11 EPPN01_H 33.8 4.87 0
## 12 EPPN11_H 32.6 7.29 1
## 13 EPPN02_H 30.6 4.81 0
## 14 EPPN20_T 30.4 5.07 0
## 15 EPPN07_L 28.7 2.00 0
## 16 EPPN11_L 28.6 7.13 0
## # A tibble: 16 × 4
## Genotype mean std.dev n_missing
## <fct> <dbl> <dbl> <int>
## 1 EPPN13_H 222. 69.5 0
## 2 EPPN10_H 210. 69.0 0
## 3 EPPN08_H 204. 105. 0
## 4 EPPN09_H 194. 43.7 0
## 5 EPPN06_H 189. 85.2 0
## 6 EPPN12_H 182. 55.9 0
## 7 EPPN05_H 181. 71.8 0
## 8 EPPN04_H 181. 40.6 0
## 9 EPPN01_H 180. 45.3 0
## 10 EPPN07_L 165. 61.2 0
## 11 EPPN11_H 164. 63.1 1
## 12 EPPN02_H 145. 43.0 0
## 13 EPPN03_H 143. 79.2 0
## 14 EPPN11_L 133. 59.3 0
## 15 EPPN10_L 132. 40.5 0
## 16 EPPN20_T 132. 39.3 0
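Per-genotype tables of this kind can be sketched in base R as follows. The toy values are made up, and the code is an illustration of the computation rather than the chunk that produced the tables.

```r
# Toy data; the values are made up for illustration.
toy <- data.frame(
  Genotype   = factor(c("A", "A", "A", "B", "B", "B")),
  DW_shoot_g = c(30, 34, NA, 40, 44, 48)
)

# Mean, standard deviation and number of missing values per genotype.
per_group <- lapply(split(toy$DW_shoot_g, toy$Genotype), function(x) {
  data.frame(mean = mean(x, na.rm = TRUE),
             std.dev = sd(x, na.rm = TRUE),
             n_missing = sum(is.na(x)))
})
by_geno <- do.call(rbind, per_group)
by_geno <- data.frame(Genotype = rownames(by_geno), by_geno, row.names = NULL)

# Sort by decreasing mean, as in the tables.
by_geno <- by_geno[order(-by_geno$mean), ]
print(by_geno)
```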
The explanatory variable (X) is the genotype, a categorical variable. The responses (Y) are the phenotypic data (in this case FW_shoot_g and Measured_plant_height_cm).
## [1] EPPN20_T EPPN06_H EPPN08_H EPPN10_L EPPN05_H EPPN11_H EPPN09_H EPPN04_H
## [9] EPPN03_H EPPN12_H EPPN10_H EPPN01_H EPPN02_H EPPN11_L EPPN13_H EPPN07_L
## 16 Levels: EPPN01_H EPPN02_H EPPN03_H EPPN04_H EPPN05_H EPPN06_H ... EPPN20_T
## [1] "DW_shoot_g" "FW_shoot_g"
NOTE: CHANGE THE VARIABLES HERE

### 1. First linear models

First, we fit the model Y = X + r + c + e, where:

- Y is the phenotypic trait;
- X is the genotype;
- r is the row effect (fixed or random);
- c is the column effect (fixed or random);
- e is the residual error.
Models for DW_shoot_g and FW_shoot_g with fixed or random effects of Row and Column.
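The fixed-effects version of this model can be sketched on simulated data as follows. The 4 x 6 grid, the genotype names and the trait values are all made up; only the model formula mirrors the one used in this report.

```r
# Simulated 4 x 6 grid; genotypes are laid out so that they are not
# confounded with rows or columns. All values are made up.
set.seed(42)
design <- expand.grid(Row = 1:4, Column = 1:6)
design$Genotype <- factor(paste0("G", (design$Row + design$Column) %% 4 + 1))
design$Row    <- factor(design$Row)
design$Column <- factor(design$Column)
design$y <- 35 + as.integer(design$Genotype) + rnorm(nrow(design), sd = 3)

# Y = X + r + c + e with row and column as fixed effects.
fit_fixed <- lm(y ~ Genotype + Row + Column, data = design)

# Sequential ANOVA table: one line per term plus the residuals.
aov_fixed <- anova(fit_fixed)
print(aov_fixed)
```

With fixed row and column effects, every row and column level costs one degree of freedom, which is why the real fits below leave only 84 residual degrees of freedom out of 125 observations.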
##
## Call:
## lm(formula = DW_shoot_g ~ Genotype + Row + Column, data = endpoint_clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -20.874 -6.271 -1.509 5.519 50.723
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 28.2396 7.7544 3.642 0.000467 ***
## GenotypeEPPN02_H -2.9902 6.1086 -0.490 0.625752
## GenotypeEPPN03_H -1.3912 5.8232 -0.239 0.811766
## GenotypeEPPN04_H 2.6873 6.4388 0.417 0.677484
## GenotypeEPPN05_H 0.8991 5.5722 0.161 0.872207
## GenotypeEPPN06_H 1.4508 5.2469 0.277 0.782832
## GenotypeEPPN07_L -8.1686 10.3948 -0.786 0.434176
## GenotypeEPPN08_H 2.7766 6.5691 0.423 0.673614
## GenotypeEPPN09_H 9.7024 5.8961 1.646 0.103592
## GenotypeEPPN10_H 4.2165 6.4622 0.652 0.515869
## GenotypeEPPN10_L 11.2486 10.7744 1.044 0.299475
## GenotypeEPPN11_H -4.4941 5.9263 -0.758 0.450374
## GenotypeEPPN11_L -8.0526 8.7896 -0.916 0.362208
## GenotypeEPPN12_H 13.1270 5.8280 2.252 0.026904 *
## GenotypeEPPN13_H 14.4846 7.5473 1.919 0.058357 .
## GenotypeEPPN20_T 2.3807 8.9236 0.267 0.790288
## Row2 -4.1285 4.3358 -0.952 0.343734
## Row3 -0.6911 4.2858 -0.161 0.872283
## Row4 -0.2334 4.0814 -0.057 0.954530
## Row5 0.5045 4.2127 0.120 0.904959
## Row6 1.4742 4.3074 0.342 0.733018
## Row7 10.4890 4.2989 2.440 0.016793 *
## Column2 9.9420 7.5028 1.325 0.188728
## Column3 11.7866 7.3113 1.612 0.110688
## Column4 -2.1079 7.0850 -0.298 0.766811
## Column5 -1.6712 7.6078 -0.220 0.826663
## Column6 7.8853 7.6197 1.035 0.303702
## Column7 10.6463 7.6410 1.393 0.167201
## Column8 6.2134 7.3557 0.845 0.400673
## Column9 14.4216 7.3664 1.958 0.053579 .
## Column10 4.6916 8.2359 0.570 0.570435
## Column11 -4.1446 7.9392 -0.522 0.603019
## Column12 3.0330 7.8764 0.385 0.701158
## Column13 -1.8445 7.4816 -0.247 0.805865
## Column14 5.9081 7.7465 0.763 0.447792
## Column15 4.9084 7.3706 0.666 0.507268
## Column16 -0.5763 7.8673 -0.073 0.941776
## Column17 1.1860 7.4026 0.160 0.873095
## Column18 4.3316 7.4252 0.583 0.561212
## Column19 1.6794 7.3640 0.228 0.820154
## Column20 1.2678 7.4573 0.170 0.865415
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.04 on 84 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.3557, Adjusted R-squared: 0.04887
## F-statistic: 1.159 on 40 and 84 DF, p-value: 0.2815
## Analysis of Variance Table
##
## Response: DW_shoot_g
## Df Sum Sq Mean Sq F value Pr(>F)
## Genotype 15 2632.5 175.50 1.2103 0.2805
## Row 6 1316.1 219.35 1.5128 0.1839
## Column 19 2775.3 146.07 1.0074 0.4617
## Residuals 84 12180.2 145.00
## boundary (singular) fit: see help('isSingular')
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: DW_shoot_g ~ Genotype + (1 | Row) + (1 | Column)
## Data: endpoint_clean
##
## REML criterion at convergence: 884.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.4674 -0.5997 -0.1208 0.4211 5.7901
##
## Random effects:
## Groups Name Variance Std.Dev.
## Column (Intercept) 5.573e-12 2.361e-06
## Row (Intercept) 4.200e+00 2.049e+00
## Residual 1.455e+02 1.206e+01
## Number of obs: 125, groups: Column, 20; Row, 7
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 33.8930 4.1012 100.9566 8.264 5.79e-13 ***
## GenotypeEPPN02_H -3.0693 5.6930 103.4879 -0.539 0.5910
## GenotypeEPPN03_H -0.1894 5.5567 104.6532 -0.034 0.9729
## GenotypeEPPN04_H 0.8041 6.0836 103.1277 0.132 0.8951
## GenotypeEPPN05_H 0.7398 5.2378 103.6519 0.141 0.8880
## GenotypeEPPN06_H 0.5284 4.9392 104.9669 0.107 0.9150
## GenotypeEPPN07_L -4.7999 9.4796 106.5084 -0.506 0.6137
## GenotypeEPPN08_H 1.4151 6.0931 104.4440 0.232 0.8168
## GenotypeEPPN09_H 7.9956 5.4401 105.2978 1.470 0.1446
## GenotypeEPPN10_H 5.7290 6.1102 106.4489 0.938 0.3506
## GenotypeEPPN10_L 3.1602 9.4969 107.5285 0.333 0.7400
## GenotypeEPPN11_H -1.5713 5.6993 104.4418 -0.276 0.7833
## GenotypeEPPN11_L -5.7453 8.0905 106.9824 -0.710 0.4792
## GenotypeEPPN12_H 10.8092 5.5510 103.8276 1.947 0.0542 .
## GenotypeEPPN13_H 12.4213 6.8267 108.9198 1.820 0.0716 .
## GenotypeEPPN20_T -3.2346 8.0643 104.8364 -0.401 0.6892
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation matrix not shown by default, as p = 16 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## ANOVA-like table for random-effects: Single term deletions
##
## Model:
## DW_shoot_g ~ Genotype + (1 | Row) + (1 | Column)
## npar logLik AIC LRT Df Pr(>Chisq)
## <none> 19 -442.30 922.61
## (1 | Row) 18 -442.52 921.04 0.43533 1 0.5094
## (1 | Column) 18 -442.30 920.61 0.00000 1 1.0000
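The message "boundary (singular) fit" indicates that at least one variance component was estimated at, or numerically very close to, zero; here the Column variance is about 5.6e-12. The sketch below shows how this can be checked on simulated data. It assumes that the lme4 package is installed (the code is skipped otherwise), and the data are simulated without any true row or column effect, so a singular fit is likely but not guaranteed.

```r
set.seed(7)
field <- expand.grid(Row = factor(1:7), Column = factor(1:20))
field$Genotype <- factor(sample(rep(paste0("G", 1:4), length.out = nrow(field))))
# Simulated trait with a genotype signal but no true row or column effect.
field$y <- 35 + as.integer(field$Genotype) + rnorm(nrow(field), sd = 12)

has_lme4 <- requireNamespace("lme4", quietly = TRUE)
if (has_lme4) {
  fit_mixed <- lme4::lmer(y ~ Genotype + (1 | Row) + (1 | Column), data = field)
  # TRUE when a variance component sits on the boundary (estimated as zero).
  singular <- lme4::isSingular(fit_mixed)
  # Variance components, to see which term collapsed.
  vc <- as.data.frame(lme4::VarCorr(fit_mixed))[, c("grp", "vcov")]
  print(singular)
  print(vc)
}
```

When a term collapses to zero, one common option is to refit the model without it; the likelihood ratio tests above (p = 1 for Column) point in the same direction.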
##
## Call:
## lm(formula = FW_shoot_g ~ Genotype + Row + Column, data = endpoint_clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -164.647 -39.774 8.291 37.825 146.473
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 132.5600 44.0975 3.006 0.00349 **
## GenotypeEPPN02_H -33.0756 34.7381 -0.952 0.34376
## GenotypeEPPN03_H -49.2577 33.1153 -1.487 0.14064
## GenotypeEPPN04_H -0.5520 36.6162 -0.015 0.98801
## GenotypeEPPN05_H -0.1711 31.6877 -0.005 0.99571
## GenotypeEPPN06_H 12.3357 29.8378 0.413 0.68035
## GenotypeEPPN07_L -20.8507 59.1128 -0.353 0.72518
## GenotypeEPPN08_H 19.0290 37.3572 0.509 0.61182
## GenotypeEPPN09_H 15.2931 33.5299 0.456 0.64949
## GenotypeEPPN10_H 29.7689 36.7492 0.810 0.42020
## GenotypeEPPN10_L -6.3234 61.2717 -0.103 0.91805
## GenotypeEPPN11_H -20.3117 33.7015 -0.603 0.54834
## GenotypeEPPN11_L -84.8333 49.9846 -1.697 0.09336 .
## GenotypeEPPN12_H 12.0368 33.1425 0.363 0.71738
## GenotypeEPPN13_H 21.1223 42.9196 0.492 0.62391
## GenotypeEPPN20_T 4.6765 50.7465 0.092 0.92679
## Row2 7.3553 24.6566 0.298 0.76620
## Row3 18.2423 24.3725 0.748 0.45626
## Row4 -11.1830 23.2103 -0.482 0.63119
## Row5 -18.7620 23.9567 -0.783 0.43573
## Row6 -2.3768 24.4955 -0.097 0.92293
## Row7 20.7756 24.4468 0.850 0.39784
## Column2 82.4323 42.6665 1.932 0.05673 .
## Column3 91.1612 41.5776 2.193 0.03111 *
## Column4 3.6414 40.2907 0.090 0.92820
## Column5 34.0494 43.2639 0.787 0.43349
## Column6 52.6270 43.3313 1.215 0.22795
## Column7 51.6726 43.4524 1.189 0.23772
## Column8 82.8466 41.8303 1.981 0.05092 .
## Column9 54.2173 41.8912 1.294 0.19913
## Column10 67.3053 46.8355 1.437 0.15442
## Column11 7.6135 45.1488 0.169 0.86649
## Column12 53.3238 44.7915 1.190 0.23721
## Column13 37.2885 42.5462 0.876 0.38330
## Column14 18.9326 44.0526 0.430 0.66846
## Column15 14.4763 41.9148 0.345 0.73068
## Column16 26.9097 44.7394 0.601 0.54914
## Column17 9.3940 42.0968 0.223 0.82396
## Column18 69.7947 42.2255 1.653 0.10208
## Column19 65.3619 41.8777 1.561 0.12234
## Column20 64.5650 42.4083 1.522 0.13165
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 68.48 on 84 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.2952, Adjusted R-squared: -0.04039
## F-statistic: 0.8796 on 40 and 84 DF, p-value: 0.6679
## Analysis of Variance Table
##
## Response: FW_shoot_g
## Df Sum Sq Mean Sq F value Pr(>F)
## Genotype 15 67131 4475.4 0.9544 0.5095
## Row 6 16220 2703.4 0.5765 0.7480
## Column 19 81644 4297.1 0.9164 0.5649
## Residuals 84 393900 4689.3
## boundary (singular) fit: see help('isSingular')
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: FW_shoot_g ~ Genotype + (1 | Row) + (1 | Column)
## Data: endpoint_clean
##
## REML criterion at convergence: 1256.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.83300 -0.62180 -0.01319 0.63786 2.35488
##
## Random effects:
## Groups Name Variance Std.Dev.
## Column (Intercept) 0 0.00
## Row (Intercept) 0 0.00
## Residual 4512 67.17
## Number of obs: 125, groups: Column, 20; Row, 7
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 179.8056 22.3895 109.0000 8.031 1.23e-12 ***
## GenotypeEPPN02_H -34.8700 31.6635 109.0000 -1.101 0.273
## GenotypeEPPN03_H -36.4046 30.8618 109.0000 -1.180 0.241
## GenotypeEPPN04_H 0.8844 33.8497 109.0000 0.026 0.979
## GenotypeEPPN05_H 1.2437 29.1262 109.0000 0.043 0.966
## GenotypeEPPN06_H 8.9206 27.4214 109.0000 0.325 0.746
## GenotypeEPPN07_L -14.5856 52.5080 109.0000 -0.278 0.782
## GenotypeEPPN08_H 23.7130 33.8497 109.0000 0.701 0.485
## GenotypeEPPN09_H 14.0154 30.1900 109.0000 0.464 0.643
## GenotypeEPPN10_H 30.1459 33.8497 109.0000 0.891 0.375
## GenotypeEPPN10_L -48.0706 52.5080 109.0000 -0.915 0.362
## GenotypeEPPN11_H -15.4622 31.6635 109.0000 -0.488 0.626
## GenotypeEPPN11_L -46.8089 44.7790 109.0000 -1.045 0.298
## GenotypeEPPN12_H 1.9964 30.8618 109.0000 0.065 0.949
## GenotypeEPPN13_H 42.1304 37.4648 109.0000 1.125 0.263
## GenotypeEPPN20_T -48.2056 44.7790 109.0000 -1.077 0.284
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation matrix not shown by default, as p = 16 > 12.
## Use print(x, correlation=TRUE) or
## vcov(x) if you need it
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## Genotype 67131 4475.4 15 109 0.992 0.4688
## boundary (singular) fit: see help('isSingular')
## boundary (singular) fit: see help('isSingular')
## ANOVA-like table for random-effects: Single term deletions
##
## Model:
## FW_shoot_g ~ Genotype + (1 | Row) + (1 | Column)
## npar logLik AIC LRT Df Pr(>Chisq)
## <none> 19 -628.29 1294.6
## (1 | Row) 18 -628.29 1292.6 0 1 1
## (1 | Column) 18 -628.29 1292.6 0 1 1
Model with X as Plant_type instead of Genotype, and row and column effects as random effects. Plant_type is defined as H for Hybrid, L for pure Line and T for Tester.
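How Plant_type is built is not shown here. One plausible construction, given genotype names such as EPPN01_H, EPPN07_L and EPPN20_T, is to take the suffix after the last underscore; this is an assumption. The level order below (T first) is also an assumption, chosen because the model output for this section reports coefficients named Plant_typeH and Plant_typeL, which implies T as the reference level.

```r
genotype <- c("EPPN01_H", "EPPN07_L", "EPPN10_L", "EPPN20_T")

# Assumption: the plant type is the suffix after the last underscore.
suffix <- sub("^.*_", "", genotype)

# "T" is placed first so that it is the reference level of the factor.
plant_type <- factor(suffix, levels = c("T", "H", "L"))
print(table(plant_type))
```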
## boundary (singular) fit: see help('isSingular')
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: DW_shoot_g ~ Plant_type + (1 | Row) + (1 | Column)
## Data: endpoint_clean
##
## REML criterion at convergence: 967
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.5048 -0.5790 -0.1394 0.2229 6.4321
##
## Random effects:
## Groups Name Variance Std.Dev.
## Column (Intercept) 0.00 0.000
## Row (Intercept) 3.56 1.887
## Residual 149.49 12.227
## Number of obs: 125, groups: Column, 20; Row, 7
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 30.8187 7.1286 121.8536 4.323 3.16e-05 ***
## Plant_typeH 5.5773 7.1843 120.0304 0.776 0.439
## Plant_typeL 0.2128 8.4775 120.0526 0.025 0.980
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) Plnt_H
## Plant_typeH -0.982
## Plant_typeL -0.832 0.826
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')
## ASReml Version 4.2 24/05/2024 03:11:33
## LogLik Sigma2 DF wall
## 1 -343.3150 133.2052 109 03:11:33
## 2 -342.5665 138.0757 109 03:11:33 ( 1 restrained)
## 3 -342.1558 145.5915 109 03:11:33 ( 1 restrained)
## 4 -342.1411 145.5259 109 03:11:33 ( 1 restrained)
## 5 -342.1406 145.5289 109 03:11:33 ( 1 restrained)
## 6 -342.1405 145.5307 109 03:11:33 ( 1 restrained)
## component std.error z.ratio bound %ch
## Row 4.200342e+00 7.844122 0.5354764 P 0
## Column 6.708305e-06 NA NA B NA
## units!R 1.455307e+02 22.309571 6.5232389 P 0
PROBLEM IN THIS BLOCK
Model with Soil as explanatory variable.
PROBLEM IN THIS BLOCK
PROBLEM IN THIS BLOCK. PCA, clustering, etc.; see p. 56 of Biométrie 1.
In this part, we look at the timeseries, S_timeseries and T_timeseries datasets.
TO BE PUT BACK INTO THE CODE LATER
h1 <- ggplot(timeseries, aes(x = Date)) +
  geom_bar(aes(fill = Genotype), position = "stack", width = 1) +
  scale_fill_viridis_d(option = "D") +
  labs(x = "Date", y = "Number of observations",
       title = "Observations per day for timeseries_shoot_and_plant") +
  scale_y_continuous(breaks = seq(from = 0, to = 325, by = 25)) +
  scale_x_date(date_breaks = "2 days", date_labels = "%d-%m-%Y") + # example date format (%d-%m-%Y)
  theme(axis.text.x = element_text(angle = 45, hjust = 1), # rotate the date labels
        panel.grid.major.x = element_line(color = "lightgray", linewidth = 0.5), # grid settings
        panel.grid.minor.x = element_blank()) # remove the minor grid lines

h3 <- ggplot(T_timeseries, aes(x = Date)) +
  geom_bar(aes(fill = Genotype), position = "stack", width = 1) +
  scale_fill_viridis_d(option = "D") +
  labs(x = "Date", y = "Number of observations",
       title = "Observations per day for T_timeseries_shoot") +
  scale_y_continuous(breaks = seq(from = 0, to = 325, by = 25)) +
  scale_x_date(date_breaks = "2 days", date_labels = "%d-%m-%Y") + # example date format (%d-%m-%Y)
  theme(axis.text.x = element_text(angle = 45, hjust = 1), # rotate the date labels
        panel.grid.major.x = element_line(color = "lightgray", linewidth = 0.5), # grid settings
        panel.grid.minor.x = element_blank()) # remove the minor grid lines

# NOTE: h2 is not defined in this block; it must exist before this line runs.
combined <- h1 + h2 + h3 & theme(legend.position = "top")
combined + plot_layout(guides = "collect")
First, we extract the variables of the timeseries dataframe.
## [1] "S_Height_cm" "S_Height_pixel" "S_Area_cmsquared"
## [4] "S_Area_pixel" "S_Perimeter_cm" "S_Perimeter_pixel"
## [7] "S_Compactness" "S_Width_cm" "S_Width_pixel"
## timePoint_endpoint contains data for experiment EPPN2020_NaPPI.
##
## It contains 1 time points.
## First time point: 2020-07-05
## Last time point: 2020-07-05
##
## No check genotypes are defined.
## timeNumber timePoint
## 1 1 2020-07-05
Count the number of observations per trait.
## [1] "How many observations for DW_shoot_g"
## 2020-07-05
## 125
## [1] "How many observations for FW_shoot_g"
## 2020-07-05
## 125
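The count of 125 matches the 126 rows of the endpoint data with one missing value per trait, as reported in the summaries earlier. A base R sketch of such a count, with made-up values and a hypothetical helper name `count_obs`:

```r
# Toy long-format data: one row per plant and time point (values made up).
obs <- data.frame(
  timePoint  = as.Date(rep(c("2020-06-17", "2020-07-05"), each = 3)),
  DW_shoot_g = c(30.1, NA, 28.4, 33.9, 35.2, 31.0)
)

# Hypothetical helper: number of non-missing values per time point.
count_obs <- function(data, trait) {
  table(data$timePoint[!is.na(data[[trait]])])
}

counts_demo <- count_obs(obs, "DW_shoot_g")
print(counts_demo)
```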
Check the layout at the only timepoint.
## timeNumber timePoint
## 1 1 2020-07-05
Check the heatmap of the raw data at harvest.
Count the number of observations per trait.
## [1] "How many observations for DW_shoot_g"
## 2020-07-05
## 125
## [1] "How many observations for FW_shoot_g"
## 2020-07-05
## 125
Check the heatmap of the data at harvest.
## timePoint_S contains data for experiment EPPN2020_NaPPI.
##
## It contains 13 time points.
## First time point: 2020-06-17
## Last time point: 2020-07-05
##
## No check genotypes are defined.
## timeNumber timePoint
## 1 1 2020-06-17
## 2 2 2020-06-22
## 3 3 2020-06-23
## 4 4 2020-06-24
## 5 5 2020-06-25
## 6 6 2020-06-27
## 7 7 2020-06-28
## 8 8 2020-06-29
## 9 9 2020-06-30
## 10 10 2020-07-01
## 11 11 2020-07-02
## 12 12 2020-07-04
## 13 13 2020-07-05
We choose the variables that we want to see. Count the number of observations per variable.
## [1] "How many observations for S_Height_cm"
## 2020-06-17 2020-06-22 2020-06-23 2020-06-24 2020-06-25 2020-06-27 2020-06-28
## 125 125 125 125 125 125 125
## 2020-06-29 2020-06-30 2020-07-01 2020-07-02 2020-07-04 2020-07-05
## 125 125 125 125 125 125
## [1] "How many observations for S_Area_cmsquared"
## 2020-06-17 2020-06-22 2020-06-23 2020-06-24 2020-06-25 2020-06-27 2020-06-28
## 125 125 125 125 125 125 125
## 2020-06-29 2020-06-30 2020-07-01 2020-07-02 2020-07-04 2020-07-05
## 125 125 125 125 125 125
## [1] "How many observations for S_Perimeter_cm"
## 2020-06-17 2020-06-22 2020-06-23 2020-06-24 2020-06-25 2020-06-27 2020-06-28
## 125 125 125 125 125 125 125
## 2020-06-29 2020-06-30 2020-07-01 2020-07-02 2020-07-04 2020-07-05
## 125 125 125 125 125 125
## [1] "How many observations for S_Compactness"
## 2020-06-17 2020-06-22 2020-06-23 2020-06-24 2020-06-25 2020-06-27 2020-06-28
## 125 125 125 125 125 125 125
## 2020-06-29 2020-06-30 2020-07-01 2020-07-02 2020-07-04 2020-07-05
## 125 125 125 125 125 125
## [1] "How many observations for S_Width_cm"
## 2020-06-17 2020-06-22 2020-06-23 2020-06-24 2020-06-25 2020-06-27 2020-06-28
## 125 125 125 125 125 125 125
## 2020-06-29 2020-06-30 2020-07-01 2020-07-02 2020-07-04 2020-07-05
## 125 125 125 125 125 125
Check the genotypic layout at every timepoint.
## timeNumber timePoint
## 1 1 2020-06-17
## 2 2 2020-06-22
## 3 3 2020-06-23
## 4 4 2020-06-24
## 5 5 2020-06-25
## 6 6 2020-06-27
## 7 7 2020-06-28
## 8 8 2020-06-29
## 9 9 2020-06-30
## 10 10 2020-07-01
## 11 11 2020-07-02
## 12 12 2020-07-04
## 13 13 2020-07-05
Check the heatmap of the raw data at all the time points.
Check some time courses of raw data.
Single-observation outliers are detected with the SingleOut detection functions. We first select a subset of plants to tune the confIntSize and nnLocfit settings.
For all the traits
We can then run the detection on all plants.
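To give an intuition for the two settings, the sketch below flags points that lie far from a robust local regression fitted to a single time course. It uses base R `loess` as a stand-in and is not the statgenHTP implementation: the `span` argument plays a role loosely similar to nnLocfit (how local the fit is), and `width` one loosely similar to confIntSize (how far a point may deviate before it is flagged). The helper name and the data are hypothetical.

```r
# Hypothetical helper: flag points far from a robust local-regression fit.
flag_single_outliers <- function(time, y, span = 0.75, width = 5) {
  fit <- loess(y ~ time, span = span, family = "symmetric")
  res <- residuals(fit)
  # A point is flagged when its residual exceeds `width` robust
  # standard deviations (median absolute deviation).
  which(abs(res) > width * mad(res))
}

set.seed(3)
days   <- 1:13
height <- 2 * days + rnorm(13, sd = 0.5)
height[7] <- 60  # artificial outlier

flagged <- flag_single_outliers(days, height)
print(flagged)
```

Tuning on a few plants first, as done here, helps avoid settings so strict that normal growth spurts are flagged or so loose that obvious measurement errors pass.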
## [1] "S_Height_cm"
## timePoint
## 2020-06-17 2020-07-05
## 7 1
## [1] "S_Area_cmsquared"
## timePoint
## 2020-07-05
## 1
## No outlier for S_Perimeter_cm
## [1] "S_Compactness"
## timePoint
## 2020-06-17 2020-06-23
## 3 2
## [1] "S_Width_cm"
## timePoint
## 2020-06-17 2020-07-05
## 3 1
Check the heatmap of the data with outliers detection at all the time points.
## No Single_outliers object found for trait S_Perimeter_cm
Fit a model for all time points with no extra fixed effects.
## 2020-06-17
## 2020-06-22
## 2020-06-23
## 2020-06-24
## 2020-06-25
## 2020-06-27
## 2020-06-28
## 2020-06-29
## 2020-06-30
## 2020-07-01
## 2020-07-02
## 2020-07-04
## 2020-07-05
## 2020-06-17
## 2020-06-22
## 2020-06-23
## 2020-06-24
## 2020-06-25
## 2020-06-27
## 2020-06-28
## 2020-06-29
## 2020-06-30
## 2020-07-01
## 2020-07-02
## 2020-07-04
## 2020-07-05
## 2020-06-17
## 2020-06-22
## 2020-06-23
## 2020-06-24
## 2020-06-25
## 2020-06-27
## 2020-06-28
## 2020-06-29
## 2020-06-30
## 2020-07-01
## 2020-07-02
## 2020-07-04
## 2020-07-05
## 2020-06-17
## 2020-06-22
## 2020-06-23
## 2020-06-24
## 2020-06-25
## 2020-06-27
## 2020-06-28
## 2020-06-29
## 2020-06-30
## 2020-07-01
## 2020-07-02
## 2020-07-04
## 2020-07-05
## 2020-06-17
## 2020-06-22
## 2020-06-23
## 2020-06-24
## 2020-06-25
## 2020-06-27
## 2020-06-28
## 2020-06-29
## 2020-06-30
## 2020-07-01
## 2020-07-02
## 2020-07-04
## 2020-07-05
## Output at: C:/Users/elise/Documents/Mémoire/Template/NaPPI/NaPPI_Template/testtemplate/S_Height_cm_mod.gif
## Output at: C:/Users/elise/Documents/Mémoire/Template/NaPPI/NaPPI_Template/testtemplate/S_Area_cmsquared_mod.gif
## Output at: C:/Users/elise/Documents/Mémoire/Template/NaPPI/NaPPI_Template/testtemplate/S_Perimeter_cm_mod.gif
## Output at: C:/Users/elise/Documents/Mémoire/Template/NaPPI/NaPPI_Template/testtemplate/S_Compactness_mod.gif
## Output at: C:/Users/elise/Documents/Mémoire/Template/NaPPI/NaPPI_Template/testtemplate/S_Width_cm_mod.gif
## 2020-06-17
## 2020-06-22
## 2020-06-23
## 2020-06-24
## 2020-06-25
## 2020-06-27
## 2020-06-28
## 2020-06-29
## 2020-06-30
## 2020-07-01
## 2020-07-02
## 2020-07-04
## 2020-07-05
cutoff <- 1
thrCor <- c(0.9)[cutoff] # correlation threshold
thrPca <- c(30)[cutoff] # pca angle threshold
thrSlope <- c(0.7)[cutoff] # slope threshold
Series_test <- detectSerieOut(corrDat = Spatial_Corrected,
predDat = predDat,
coefDat = coefDat,
trait = paste0(trait_name, "_corr"),
thrCor = thrCor,
thrPca = thrPca,
thrSlope = thrSlope,
geno.decomp = "geno.decomp")
## Warning: The following genotypes have less than 3 plotIds and are skipped in the outlier detection:
## EPPN07_L.Line, EPPN10_L.Line
plot(Series_test, genotypes = levels(factor(Series_test$genotype)))
Spatial_Corrected_Out <- Spatial_Corrected